Modeling Customer Lifetime With Dynamic Customer Feedback Information

New Perspectives in Business and Econometrics

Alexander Kulumbeg

Marketing Institutes MCA & RDS

Daniel Winkler

Introduction

Story

  • Subscription businesses are very popular (McCarthy and Fader 2017)
  • Contractual setting - curated shopping
  • Nation-wide apparel subscription box service provider
  • Female customers only


  • Monthly surprise boxes with clothes selected by a human stylist
  • Customers can approve the box or request changes to it
  • Once the box is received, each item is rated by category, with optional written feedback

Story II

Ideas


Problem

  • What is hiding in the dynamic feedback (e.g., emotionality, eloquence, engagement…)?


  • How do these components influence the risk of customer attrition?


  • Can we identify other (latent) time-varying signals that affect customer lifetime?

Data

  • Information on
    • Orders
    • Feedback
    • App usage
    • Customer journey
    • Style preferences
    • Stylist performance
    • Previews of Boxes
  • ca. 57,000 unique customers
  • ca. 260,000 transactions
  • ca. 1,050,000 feedback items
  • Distilled into a box-level dataframe
    • User demographics
    • User contract length
    • User lifetime spending
    • Box-level feedback variables
      • Word count
      • Sentiment
      • Eloquence
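The aggregation from feedback items to box-level variables can be sketched as follows. This is a minimal illustration only: the field names (`box_id`, `text`, `sentiment`, `eloquence`) and the example records are hypothetical, and the per-item sentiment and eloquence scores are assumed to have been computed beforehand by some text-analysis step that is not shown here. Word counts are summed per box, while sentiment and eloquence are averaged per box.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical item-level feedback records (not real data).
items = [
    {"box_id": 1, "text": "Love the fit", "sentiment": 0.6, "eloquence": 0.2},
    {"box_id": 1, "text": "Too tight around the shoulders", "sentiment": -0.4, "eloquence": 0.1},
    {"box_id": 2, "text": "", "sentiment": 0.0, "eloquence": 0.0},
]

def box_level(feedback_items):
    """Aggregate item-level feedback into one row per box."""
    grouped = defaultdict(list)
    for item in feedback_items:
        grouped[item["box_id"]].append(item)
    rows = {}
    for box_id, group in grouped.items():
        rows[box_id] = {
            # Sum of whitespace-separated tokens over all items in the box
            "word_count": sum(len(it["text"].split()) for it in group),
            # Mean of the precomputed per-item scores
            "sentiment": mean(it["sentiment"] for it in group),
            "eloquence": mean(it["eloquence"] for it in group),
        }
    return rows

boxes = box_level(items)
```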

Data II

Data III

| Statistic | N | Mean | St. Dev. | Min | Max |
|---|---|---|---|---|---|
| Customer Age | 55,046 | 37.561 | 8.279 | 18 | 98 |
| Customer Lifetime Spending | 55,046 | 1,091.788 | 1,232.202 | 0.000 | 15,054.740 |
| Contract Length (Days) | 55,046 | 326.063 | 198.436 | 28 | 882 |
| Contract Length (Months) | 55,046 | 10.242 | 6.521 | 1 | 29 |
| Feedback Word Count Per Box (Sum) | 55,046 | 46.186 | 79.167 | 0 | 2,536 |
| Feedback Sentiment Per Box (Mean) | 55,046 | 0.013 | 0.100 | -1.414 | 1.080 |
| Feedback Eloquence Per Box (Mean) | 55,046 | 0.080 | 0.095 | 0.000 | 1.000 |

Model

Causal Model

Model Details

A Bayesian Model for Time-Varying Parameters


A piecewise exponential model for lifetimes.

  • A given set \(\mathcal{S}=\left\{s_{0}=0, s_{1}, \ldots, s_{J}\right\}\) with \(s_{0}<s_{1}<\cdots<s_{J}\) partitions the time axis into \(J\) intervals \(\left(s_{0}, s_{1}\right], \ldots,\left(s_{J-1}, s_{J}\right]\)


  • The hazard is constant within each interval and depends on the covariates \(\boldsymbol z_i\) through interval-specific coefficients:

\[ \lambda(t|\boldsymbol z_i; t \in (s_{j-1}, s_j]) = \lambda_{ij} = \exp\left(\beta_{0j} + \sum_{k=1}^{K} z_{i k} \beta_{kj}\right) \]
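A minimal Python sketch of the hazard above and of the survival function it implies, \(S(t \mid \boldsymbol z_i)=\exp\left(-\sum_{j} \lambda_{ij}\, \Delta_j(t)\right)\), where \(\Delta_j(t)\) is the time spent in interval \(j\) up to \(t\). The cut points, coefficient values, and function names are illustrative assumptions, not estimates from the data.

```python
import math
from bisect import bisect_left

def interval_index(t, cuts):
    """Return j such that t lies in (s_{j-1}, s_j], with cuts = [s_0, ..., s_J]."""
    if not (cuts[0] < t <= cuts[-1]):
        raise ValueError("t must lie in (s_0, s_J]")
    return bisect_left(cuts, t)

def hazard(z, beta0_j, beta_j):
    """lambda_ij = exp(beta_0j + sum_k z_ik * beta_kj)."""
    return math.exp(beta0_j + sum(zk * bk for zk, bk in zip(z, beta_j)))

def survival(t, z, cuts, beta0, beta):
    """Survival probability at t; beta0[j-1] and beta[j-1] belong to interval j."""
    cumulative = 0.0
    for j in range(1, interval_index(t, cuts) + 1):
        exposure = min(t, cuts[j]) - cuts[j - 1]  # time spent in interval j
        cumulative += hazard(z, beta0[j - 1], beta[j - 1]) * exposure
    return math.exp(-cumulative)

# Illustrative values (assumptions): three monthly intervals, one covariate.
cuts = [0.0, 1.0, 2.0, 3.0]
beta0 = [math.log(0.1), math.log(0.2), math.log(0.3)]
beta = [[0.5], [0.0], [-0.5]]
s_baseline = survival(2.5, [0.0], cuts, beta0, beta)
```

With the covariate set to zero, the cumulative hazard at \(t=2.5\) is \(0.1 \cdot 1+0.2 \cdot 1+0.3 \cdot 0.5=0.45\), so `s_baseline` equals \(\exp(-0.45)\).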

Piecewise Exponential Model

Evolution of the \(\beta_{kj}\)’s

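As in Hemming and Shaw (2002), each coefficient follows a Gaussian random walk with initial state \(\beta_{k 0} \sim \mathcal{N}\left({\beta_{k}}, {\theta_{k}}\right)\): \[ \beta_{k j}=\beta_{k, j-1}+w_{kj}, \quad w_{kj} \sim \mathcal{N}\left(0, {\theta_{k}}\right). \] The innovation variance \(\theta_k\) controls how strongly the \(k\)-th coefficient varies over time; for \(\theta_k = 0\) the coefficient is constant.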
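The random walk can be simulated in a few lines. The sketch below is illustrative only: the function name, the seed, and the values of \(\beta_k\), \(\theta_k\), and \(J\) are assumptions. It shows that an innovation variance of zero collapses the path to a static coefficient.

```python
import random

def simulate_random_walk(beta_k, theta_k, J, rng):
    """Draw beta_k0 ~ N(beta_k, theta_k), then J random-walk steps."""
    sd = theta_k ** 0.5  # theta_k is a variance; random.gauss expects a std. dev.
    path = [rng.gauss(beta_k, sd)]
    for _ in range(J):
        path.append(path[-1] + rng.gauss(0.0, sd))
    return path

rng = random.Random(1)
path_dynamic = simulate_random_walk(0.5, 0.04, 12, rng)  # time-varying coefficient
path_static = simulate_random_walk(0.5, 0.0, 12, rng)    # theta_k = 0: constant path
```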

Priors on Innovation Variances and Initial Value Means

Triple gamma priors (Cadonna, Frühwirth-Schnatter, and Knaus 2020) are placed on both \(\beta_k\) and \(\theta_k\). The name stems from the fact that, when used for variances, the prior can be represented as a compound distribution consisting of three gamma distributions:

\[ \begin{aligned} \theta_{k}\mid{\xi}_{k}^{2} \sim \mathcal{G}\left(\frac{1}{2}, \frac{1}{2 \xi_{k}^{2}}\right), \quad& \xi_{k}^{2}\mid a^{\xi}, \kappa_{k}^{2} \sim \mathcal{G}\left(a^{\xi}, \frac{a^{\xi} \kappa_{k}^{2}}{2}\right), \\ \kappa_{k}^{2} \mid c^{\xi}, \kappa_{B}^{2} &\sim \mathcal{G}\left(c^{\xi}, \frac{c^{\xi}}{\kappa_{B}^{2}}\right). \end{aligned} \]

The first-stage conditional prior on \(\theta_k\) implies the following conditional prior on \(\sqrt{\theta_k}\): \[ \sqrt{\theta_k} \mid \xi_k^2\sim \mathcal{N}\left(0, \xi_k^2\right) \]
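The hierarchy can be sampled top-down, as in the rough sketch below. It assumes that \(\mathcal{G}(a, b)\) denotes a gamma distribution with shape \(a\) and rate \(b\), which is the reading consistent with the normal prior on \(\sqrt{\theta_k}\): a \(\mathcal{G}\left(\frac{1}{2}, \frac{1}{2\xi_k^2}\right)\) variable has the same distribution as the square of a \(\mathcal{N}(0, \xi_k^2)\) variable. Python's `random.gammavariate` takes a shape and a scale, so every rate is inverted. The hyperparameter values and function names are arbitrary illustrations.

```python
import random

def draw_theta(a_xi, c_xi, kappa_B2, rng):
    """One draw of theta_k from the three-stage gamma hierarchy (shape/rate reading)."""
    kappa2 = rng.gammavariate(c_xi, kappa_B2 / c_xi)      # rate c_xi / kappa_B2
    xi2 = rng.gammavariate(a_xi, 2.0 / (a_xi * kappa2))   # rate a_xi * kappa2 / 2
    return rng.gammavariate(0.5, 2.0 * xi2)               # rate 1 / (2 * xi2)

def draw_theta_via_normal(xi2, rng):
    """Equivalent first stage: square a N(0, xi2) draw of sqrt(theta_k)."""
    return rng.gauss(0.0, xi2 ** 0.5) ** 2

rng = random.Random(42)
theta_draws = [draw_theta(0.5, 0.5, 2.0, rng) for _ in range(1000)]

# Monte Carlo check of the first stage for a fixed xi2: both routes have mean xi2.
xi2 = 1.0
n = 20000
mean_gamma = sum(rng.gammavariate(0.5, 2.0 * xi2) for _ in range(n)) / n
mean_normal = sum(draw_theta_via_normal(xi2, rng) for _ in range(n)) / n
```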

Adding a Factor (?)

To account for unobserved heterogeneity in the data, a grouped factor component can be added to the hazard rates. Let observation \(i\) belong to group \(g\), with \(g \in\{1, \ldots, G\}\). The hazard rates then become \[ \lambda_{i j}=\exp \left(\phi_{g} f_{j}+\beta_{0 j}+\sum_{k=1}^{K} z_{i k} \beta_{k j}\right), \] where \(f_{j}\) is allowed to vary over time according to a zero-mean stochastic volatility law of motion: \[ \begin{aligned} f_{j} & \sim \mathcal{N}\left(0, e^{h_{j}}\right), \\ h_{j} \mid h_{j-1}, \phi_{f}, \sigma_{f}^{2} & \sim \mathcal{N}\left(\phi_{f} h_{j-1}, \sigma_{f}^{2}\right),\\ h_{0} & \sim \mathcal{N}\left(0, \sigma_{f}^{2} /\left(1-\phi_{f}^{2}\right)\right) . \end{aligned} \]
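A minimal simulation sketch of the factor process and the resulting hazard. All parameter values, the seed, and the function names are assumptions made for illustration; the sketch requires \(|\phi_f| < 1\) so that the stationary distribution of \(h_0\) exists.

```python
import math
import random

def simulate_factor(J, phi_f, sigma2_f, rng):
    """Simulate f_1, ..., f_J under the zero-mean stochastic volatility law of motion."""
    if abs(phi_f) >= 1.0:
        raise ValueError("phi_f must satisfy |phi_f| < 1")
    sd = math.sqrt(sigma2_f)
    h = rng.gauss(0.0, sd / math.sqrt(1.0 - phi_f ** 2))  # stationary draw of h_0
    factors = []
    for _ in range(J):
        h = rng.gauss(phi_f * h, sd)                       # AR(1) log-variance h_j
        factors.append(rng.gauss(0.0, math.exp(h / 2.0)))  # f_j has variance exp(h_j)
    return factors

def hazard_with_factor(phi_g, f_j, beta0_j, beta_j, z):
    """lambda_ij = exp(phi_g * f_j + beta_0j + sum_k z_ik * beta_kj)."""
    return math.exp(phi_g * f_j + beta0_j + sum(zk * bk for zk, bk in zip(z, beta_j)))

rng = random.Random(7)
f = simulate_factor(12, 0.9, 0.1, rng)
lam = hazard_with_factor(0.3, f[0], math.log(0.1), [0.5], [1.0])
```

Setting the group loading \(\phi_g\) to zero recovers the hazard without the factor component.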

Results I

Results II

Results III

Results IV

Conclusion

  • Modelling requirements:
    • Time-varying covariates with time-varying coefficients
  • Preliminary findings:
    • More eloquent feedback is associated with a lower risk of churn
    • Longer feedback, however, is associated with a higher risk of churn

Discussion

  • Should a latent factor be incorporated into the model?
  • Identification of causal relationships?
  • Focus on prediction vs. explanation?

References

Cadonna, Annalisa, Sylvia Frühwirth-Schnatter, and Peter Knaus. 2020. “Triple the Gamma—A Unifying Shrinkage Prior for Variance and Variable Selection in Sparse State Space and TVP Models.” Econometrics 8 (2): 20.
Fader, Peter S., Bruce G. S. Hardie, Yuzhou Liu, Joseph Davin, and Thomas Steenburgh. 2018. “How to Project Customer Retention Revisited: The Role of Duration Dependence.” Journal of Interactive Marketing 43: 1–16. https://doi.org/10.1016/j.intmar.2018.01.002.
Gamerman, Dani. 1991. “Dynamic Bayesian models for survival data.” Journal of the Royal Statistical Society: Series C (Applied Statistics) 40 (1): 63–79.
Griffin, Jim, Phil Brown, et al. 2017. “Hierarchical shrinkage priors for regression models.” Bayesian Analysis 12 (1): 135–59.
Hemming, Karla, and Ewart Shaw. 2002. “A parametric dynamic survival model applied to breast cancer survival times.” Journal of the Royal Statistical Society: Series C (Applied Statistics) 51 (4): 421–35.
Hosszejni, Darjus, and Gregor Kastner. 2021. “Modeling Univariate and Multivariate Stochastic Volatility in R with stochvol and factorstochvol.” Journal of Statistical Software 100: 1–34.
McCarthy, Daniel, and Peter Fader. 2017. “Subscription Businesses Are Booming. Here’s How to Value Them.” Harvard Business Review, 1–6.
Naumzik, Christof, Stefan Feuerriegel, and Markus Weinmann. 2022. “I Will Survive: Predicting Business Failures from Customer Ratings.” Marketing Science 41 (1): 188–207. https://doi.org/10.1287/mksc.2021.1317.
Netzer, Oded, Alain Lemaire, and Michal Herzenstein. 2019. “When Words Sweat: Identifying Signals for Loan Default in the Text of Loan Applications.” Journal of Marketing Research 56 (6): 960–80. https://doi.org/10.1177/0022243719852959.
Schweidel, David A., Peter S. Fader, and Eric T. Bradlow. 2008. “Understanding Service Retention Within and Across Cohorts Using Limited Information.” Journal of Marketing 72 (1): 82–94. https://doi.org/10.1509/jmkg.72.1.082.
Umashankar, Nita, Kihyun Hannah Kim, and Thomas Reutterer. 2022. “EXPRESS: Understanding Customer Participation Dynamics: The Case of the Subscription Box.” Journal of Marketing, 00222429221148978.
Wagner, Helga. 2011. “Bayesian estimation and stochastic model specification search for dynamic survival models.” Statistics and Computing 21 (2): 231–46.